On the Coverage of a Morphological Analyser based on "Svensk Ordbok" [A Dictionary of Swedish]
In the project A Lexicon-oriented Parser for Swedish, a stem dictionary (Sågvall Hein & Sjögreen 1991; Sjögreen, forthcom.) covering the 58,536 entry lemmas of Svensk Ordbok (1986), along with a complete inflectional grammar of Swedish (Sågvall Hein, forthcom.), was generated. This language description, together with the Uppsala Chart Processor, UCP (Sågvall Hein 1987), constitutes a morphological analyser of Swedish, henceforth referred to as SMU, short for Swedish Morphology in the UCP framework. So far, there are no word formation rules in the SMU grammar, and words outside the scope of Svensk Ordbok don't get an analysis. Even though closed in its present version, the coverage of SMU is well-defined; prior to any processing we may consult Svensk Ordbok to find out for any word form whether it will get an analysis or not; the dictionary provides an intuitive, familiar format through which we may explore the (present) competence of the SMU analyser without any prior knowledge of its formalisms or operation. SMU is also well-defined in the sense that, for any of its lemmas, Svensk Ordbok provides links to the corresponding lexemes (basic senses), and for each lexeme a definition. In our ongoing work on a machine-tractable dictionary for Swedish, we are approaching problems concerning the distinction between general and domain-specific vocabulary, and the present coverage of SMU is our starting-point for delimiting a general Swedish vocabulary. For an evaluation of the generality of the dictionary, the analyser has been applied to different sets of Swedish text.
For one of them, consisting of the 10,224 most frequent types of the 7.3 million word newspaper corpus of the Language Bank (Gellerstam 1989), the words outside the scope of the analyser have been examined in some detail. Here we will present the results achieved so far, and also discuss their impact on our continued work on the dictionary. First, however, we will briefly characterize the SMU analyser with regard to morphological descriptions and dictionary representation of inflection.
